Abstract

In 1948, W. Hoeffding introduced a large class of unbiased estimators
called U-statistics, defined as the average value of a real-valued
$m$-variate function $h$ calculated at all possible sets of $m$ points from a
random sample. In the present paper, we investigate the corresponding
robust analogue, which we call U-quantile-statistics. We are concerned
with the asymptotic behavior of the sample $p$-quantile of such a
function $h$ instead of its average. Alternatively, U-quantile-statistics
can be viewed as quantile estimators for a certain class of dependent
random variables. Examples are given by a slightly modified
Hodges-Lehmann estimator of location and the median interpoint distance
among random points in space.

Keywords: robust, U-statistics, U-max-statistics, dependent, sample
quantile, Hodges-Lehmann



1 Introduction 

U-statistics form a very important class of unbiased estimators for
distributional properties such as moments or Spearman's rank correlation. A
U-statistic of degree $m$ with symmetric kernel $h$ is a function of the form

$$U_n(\xi_1,\dots,\xi_n) = \binom{n}{m}^{-1} \sum_{J} h(\xi_{i_1},\dots,\xi_{i_m}), \tag{1.1}$$

where the sum is over $J = \{(i_1,\dots,i_m) : 1 \le i_1 < \dots < i_m \le n\}$,
$\xi_1,\dots,\xi_n$ are random elements in a measurable space $S$, and $h$ is a
real-valued Borel function on $S^m$, symmetric in its $m$ arguments. In his
seminal paper, Hoeffding [5] defined U-statistics for not necessarily
symmetric kernels and for random points in $d$-dimensional Euclidean space
$\mathbb{R}^d$. Later the concept was extended to arbitrary measurable spaces.
Since 1948, most of the classical asymptotic results for sums of i.i.d.
random variables have been formulated in the setting of U-statistics, such
as central limit laws, strong laws of large numbers, Berry-Esseen type
bounds, and laws of the iterated logarithm.

In this article we replace the average in (1.1) by the sample $p$th quantile
$H_{pn}$ and study its asymptotic distribution. For example, replacing the
average by the median ($p = 1/2$) robustifies ordinary U-statistics in a
natural way.

For any distribution function $F$, the $p$th quantile, $0 < p < 1$, is given by

$$H_p = \inf\{x : F(x) \ge p\},$$

which satisfies the inequality

$$F(H_p-) \le p \le F(H_p).$$

Here, the sample $p$th quantile $H_{pn}$ is defined as the $p$th quantile of
the empirical distribution function of the sequence of dependent random
variables

$$\{h(\xi_{i_1},\dots,\xi_{i_m}),\ 1 \le i_1 < i_2 < \dots < i_m \le n\}, \tag{1.2}$$

i.e. $H_{pn}$ is a value that separates the lowest $100p\%$ of the random
variables in (1.2) from the rest.
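The definition above can be turned into a brute-force computation. The following Python sketch is illustrative only: the function name `u_quantile` and its arguments are assumptions of this example, not notation from the paper, and it enumerates all $\binom{n}{m}$ subsets, so it is practical only for small samples.

```python
from itertools import combinations
from math import ceil


def u_quantile(sample, kernel, m, p):
    """Sample p-quantile of the kernel values over all m-subsets of `sample`.

    Hypothetical helper for illustration; brute force over C(n, m) subsets.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    # Kernel values h(xi_{i_1}, ..., xi_{i_m}) for 1 <= i_1 < ... < i_m <= n.
    values = sorted(kernel(*subset) for subset in combinations(sample, m))
    if not values:
        raise ValueError("need at least m sample points")
    # inf{x : F_n(x) >= p} is the k-th order statistic with k = ceil(p * N);
    # the small tolerance guards against floating-point round-up of p * N.
    k = max(1, ceil(p * len(values) - 1e-9))
    return values[k - 1]


# Kernel h(x, y) = (x + y) / 2 with p = 1/2 on a toy sample.
print(u_quantile([1.0, 2.0, 3.0, 4.0], lambda x, y: (x + y) / 2, m=2, p=0.5))  # prints 2.5
```

With `m=1` and the identity kernel the same function returns the ordinary sample quantile of the data.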

Under mild smoothness conditions on the distribution function $F$ of
$h(\xi_{i_1},\dots,\xi_{i_m})$, we prove asymptotic normality for this class
of estimators for $0 < p < 1$. The exceptions $p = 0$ and $p = 1$,
corresponding to the extreme values of the dependent sequence (1.2), were
already investigated in Lao and Mayer [6]. For bounded kernels, they
established Weibull limit laws for these so-called U-max-statistics. Their
results are mainly based on a Poisson approximation theorem for
U-statistics, see e.g. Barbour et al. [1].

In Section 2 we present the main result of the article and discuss the
asymptotic relative efficiency of a general U-quantile-statistic with
respect to the corresponding ordinary U-statistic. The proof of the main
result is given in Section 3. In Section 4 we apply our results to show
asymptotic normality for both a modification and a generalization of the
well-known Hodges-Lehmann estimator of location. As a second application,
we describe the limiting behavior of the median interpoint distance among a
random sample of points in Euclidean space.

2 Asymptotic normality



Asymptotic normality of $H_{pn}$ is stated in the main result of this article.

Theorem 2.1. Let $\xi_1,\dots,\xi_n$ be i.i.d. $S$-valued random elements and
$h : S^m \to \mathbb{R}$ a symmetric Borel function. Assume that the
distribution function $F$ of $h(\xi_1,\dots,\xi_m)$ is continuous at $H_p$.
Left- and right-hand derivatives of $F$ at $H_p$ are denoted by $F'(H_p-)$ and
$F'(H_p+)$, respectively, provided they exist. Put

$$\zeta = P\{h(\xi_1,\dots,\xi_m) \le H_p,\ h(\xi_1,\xi_{m+1},\dots,\xi_{2m-1}) \le H_p\} - p^2. \tag{2.1}$$

Then, for $0 < p < 1$ and $\zeta > 0$,

(i) If there exists $F'(H_p-) > 0$, then for $t < 0$,

$$\lim_{n\to\infty} P\left\{ \frac{n^{1/2}(H_{pn} - H_p)}{m\,\zeta^{1/2}/F'(H_p-)} \le t \right\} = \Phi(t).$$

(ii) If there exists $F'(H_p+) > 0$, then for $t > 0$,

$$\lim_{n\to\infty} P\left\{ \frac{n^{1/2}(H_{pn} - H_p)}{m\,\zeta^{1/2}/F'(H_p+)} \le t \right\} = \Phi(t).$$

As an immediate consequence of Theorem 2.1 the following result holds.

Corollary 2.2. If $F$ in Theorem 2.1 possesses a density $f$ in a neighborhood
of $H_p$ and $f$ is positive and continuous at $H_p$, then

$$n^{1/2}(H_{pn} - H_p) \xrightarrow{d} N\!\left(0, \frac{m^2\zeta}{f(H_p)^2}\right) \quad \text{as } n \to \infty.$$

The continuity assumption for $f$ is required, as otherwise $f$ could differ
from $F'$ on a set with mass 0.

Remark 1. For $S = \mathbb{R}$, by conditioning on the common random element
$\xi_1$, $\zeta$ of (2.1) can be written as

$$\zeta = \int \big(P\{h(x,\xi_2,\dots,\xi_m) \le H_p\}\big)^2\, dG(x) - p^2,$$

where $G$ is the distribution function of $\xi_1$.
Remark 2. By setting $m = 1$, i.e. if $h$ is a function from $S$ to
$\mathbb{R}$, Theorem 2.1 implies Theorem A (p. 77) of Serfling [7] on the
asymptotic normality of the sample quantiles for i.i.d. random variables.
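For orientation, the reduction in Remark 2 can be made explicit. The following short computation is a sketch that uses only (2.1) and the continuity of $F$ at $H_p$ (so that $F(H_p) = p$); for $m = 1$ both events in (2.1) coincide:

```latex
\zeta = P\{h(\xi_1) \le H_p\} - p^2 = F(H_p) - p^2 = p(1-p),
\qquad
\frac{m\,\zeta^{1/2}}{F'(H_p\pm)} = \frac{\{p(1-p)\}^{1/2}}{F'(H_p\pm)}.
```

The right-hand expression is the familiar normalizing constant for sample quantiles of i.i.d. observations.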

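Remark 3. By the central limit theorem for ordinary U-statistics (see e.g.
Hoeffding [5] or Serfling [7]), we are able to compare the asymptotic
efficiency of the (robust) U-quantile-statistic (for $p = 1/2$) with the
asymptotic efficiency of $U_n$ given by (1.1). Assume that the assumptions of
Corollary 2.2 are fulfilled. Furthermore, assume that the density $f$ is
symmetric about $\mu$, the random variables in (1.2) have finite variance and

$$\zeta_1 = E\big((h(\xi_1,\dots,\xi_m) - \mu)(h(\xi_1,\xi_{m+1},\dots,\xi_{2m-1}) - \mu)\big) > 0.$$

Then, the ordinary U-statistic $U_n$ based on kernel $h$ is asymptotically
normal with variance $m^2\zeta_1/n$. Hence, by Corollary 2.2, the asymptotic
relative efficiency of the U-quantile-statistic with $p = 1/2$ with respect
to $U_n$ is

$$\frac{m^2\zeta_1}{m^2\zeta/f(\mu)^2} = \frac{\zeta_1 f(\mu)^2}{\zeta}.$$

3 Proof of Theorem 2.1

We follow the proof of asymptotic normality of the usual $p$th quantile by
Serfling (p. 78f) [7], with the necessary adaptations.

Proof of Theorem 2.1. For fixed $t$ write

$$G_n(t) = P\left\{\frac{n^{1/2}(H_{pn} - H_p)}{A} \le t\right\} = P\{H_{pn} \le H_p + tAn^{-1/2}\} = P\{p \le U_n(\Delta_{nt})\}, \tag{3.1}$$

where $A$ is a constant specified later and

$$U_n(\Delta_{nt}) = \binom{n}{m}^{-1} \sum_{J} \mathbf{1}\{h(\xi_{i_1},\dots,\xi_{i_m}) \le H_p + tAn^{-1/2}\}$$

is an ordinary U-statistic with expectation

$$\Delta_{nt} = P\{h(\xi_{i_1},\dots,\xi_{i_m}) \le H_p + tAn^{-1/2}\} = F(H_p + tAn^{-1/2}).$$

By continuity of $F$ at $H_p$, $\Delta_{nt} \to p$ as $n \to \infty$.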
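Since the kernel of $U_n(\Delta_{nt})$ is either 0 or 1, its third absolute
moment $\lambda$ exists. Furthermore, by continuity of probability functions
(see e.g. p. 351 in Serfling [7]) and continuity of $F$ at $H_p$, the quantity

$$\zeta_{nt} = P\{h(\xi_1,\dots,\xi_m) \le H_p + tAn^{-1/2},\ h(\xi_1,\xi_{m+1},\dots,\xi_{2m-1}) \le H_p + tAn^{-1/2}\} - \Delta_{nt}^2$$

converges to its limit $\zeta > 0$ as $n \to \infty$. Thus, for the normalized
U-statistic

$$U_n^*(\Delta_{nt}) = \frac{n^{1/2}\,(U_n(\Delta_{nt}) - \Delta_{nt})}{m\,\zeta_{nt}^{1/2}},$$

by the Berry-Esseen theorem for U-statistics by Callaert and Janssen [2],

$$\sup_{t} \big|P\{U_n^*(\Delta_{nt}) \le t\} - \Phi(t)\big| \le \frac{C\lambda}{\zeta_{nt}^{3/2}\, n^{1/2}} \tag{3.2}$$

holds at least asymptotically as $n \to \infty$ for a universal constant
$0 < C < \infty$. From (3.1) it follows that

$$G_n(t) = P\left\{ \frac{n^{1/2}(p - \Delta_{nt})}{m\,\zeta_{nt}^{1/2}} \le U_n^*(\Delta_{nt}) \right\} = P\{U_n^*(\Delta_{nt}) \ge -c_{nt}\}$$

with

$$c_{nt} = \frac{n^{1/2}(\Delta_{nt} - p)}{m\,\zeta_{nt}^{1/2}}.$$

Clearly,

$$\Phi(t) - G_n(t) = P\{U_n^*(\Delta_{nt}) < -c_{nt}\} - (1 - \Phi(t))$$
$$= P\{U_n^*(\Delta_{nt}) < -c_{nt}\} - \Phi(-c_{nt}) + \Phi(t) - \Phi(c_{nt}),$$

and thus, by using the Berry-Esseen bound (3.2),

$$|G_n(t) - \Phi(t)| \le \frac{C\lambda}{\zeta_{nt}^{3/2}\, n^{1/2}} + |\Phi(t) - \Phi(c_{nt})|.$$

The first term on the right-hand side vanishes as $n \to \infty$. It thus
remains to show $c_{nt} \to t$ as $n \to \infty$. Since $F(H_p) = p$ by
continuity of $F$ at $H_p$, we have

$$c_{nt} = \frac{n^{1/2}(\Delta_{nt} - p)}{m\,\zeta_{nt}^{1/2}} = \frac{tA}{m\,\zeta_{nt}^{1/2}} \cdot \frac{F(H_p + tAn^{-1/2}) - F(H_p)}{tAn^{-1/2}},$$

and it follows, for $t > 0$ as $n \to \infty$,

$$c_{nt} \to \frac{tA\,F'(H_p+)}{m\,\zeta^{1/2}}.$$

Similarly, for $t < 0$ as $n \to \infty$,

$$c_{nt} \to \frac{tA\,F'(H_p-)}{m\,\zeta^{1/2}}.$$

Choosing

$$A = \frac{m\,\zeta^{1/2}}{F'(H_p+)}$$

if $t > 0$ and

$$A = \frac{m\,\zeta^{1/2}}{F'(H_p-)}$$

if $t < 0$, the claimed result follows. □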



4 Examples 

4.1 The Hodges-Lehmann estimator of location 

As an application of Theorem 2.1 (resp. Corollary 2.2), we deduce asymptotic
normality of a slightly modified version of the Hodges-Lehmann estimator [4]
of location and of a generalization. The Hodges-Lehmann estimator is given
by the median of all Walsh averages

$$\frac{\xi_i + \xi_j}{2}, \quad 1 \le i \le j \le n,$$

and estimates the location parameter associated with the one-sample Wilcoxon
test, see e.g. Hettmansperger [3].

If the Walsh averages with $i = j$ are dropped from the original definition
of the Hodges-Lehmann estimator, this modification can be expressed easily
as a U-quantile-statistic with $p = 1/2$ and kernel

$$h(x,y) = (x + y)/2.$$

Let $\xi_1,\dots,\xi_n$ be i.i.d. random variables with distribution $G$ and
square integrable and continuous density $g$, symmetric about 0 say, and
$g(0) > 0$. Then the $h(\xi_i,\xi_j)$, $1 \le i < j \le n$, with common d.f.
$F$ have a continuous density $f$ with $f(0) > 0$, and thus Corollary 2.2 can
be applied directly. Clearly

$$F(z) = P\left\{\frac{\xi_1 + \xi_2}{2} \le z\right\} = \int P\{\xi_1 \le 2z - x\}\, g(x)\,dx.$$

Thus, by symmetry,

$$f(0) = F'(0) = 2\int g(x)^2\,dx.$$

The value of $\zeta$ is found easily by Remark 1:

$$\zeta + 1/4 = \int \big(P\{\xi_2 \le -x\}\big)^2 g(x)\,dx = E\big(G(-\xi_1)^2\big) = E\,U^2$$

for a standard uniformly distributed random variable $U$. Thus,
$\zeta = 1/3 - 1/4 = 1/12 > 0$.
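The value $\zeta = 1/12$ can be checked numerically from the integral in Remark 1. The sketch below assumes, purely for illustration, that $G$ is the standard normal distribution; the helper names (`zeta_by_conditioning`, `normal_cdf`, `normal_pdf`) and the integration range are assumptions of this example, and the integral is approximated by a plain trapezoidal rule.

```python
from math import erf, exp, pi, sqrt


def normal_cdf(x):
    """Standard normal distribution function (assumed G for this example)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def normal_pdf(x):
    """Standard normal density (assumed g for this example)."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def zeta_by_conditioning(cond_prob, density, p, lo=-10.0, hi=10.0, steps=20000):
    """Trapezoidal approximation of  int cond_prob(x)^2 density(x) dx - p^2."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * cond_prob(x) ** 2 * density(x)
    return total * width - p * p


# For h(x, y) = (x + y) / 2 and H_p = 0: P{h(x, xi_2) <= 0} = G(-x).
zeta = zeta_by_conditioning(lambda x: normal_cdf(-x), normal_pdf, p=0.5)
print(zeta)  # close to 1/12
```

The same result would be obtained for any other continuous symmetric $G$, since $G(-\xi_1)$ is then uniformly distributed on $(0,1)$.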

Corollary 2.2 ensures asymptotic normality with mean 0 and variance

$$\frac{m^2\zeta}{f(0)^2} = \frac{1}{12\left(\int g(x)^2\,dx\right)^2},$$

which equals the corresponding result for the Hodges-Lehmann estimator,
see Hettmansperger [3], p. 37.
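As a small illustration, the modified estimator and its asymptotic variance can be computed directly. The Python sketch below is not taken from the paper: the function names are hypothetical, the estimator is evaluated by brute force over all pairs $i < j$, and the variance $1/(12(\int g^2)^2)$ is evaluated by a trapezoidal rule for an assumed standard normal density, for which it equals $\pi/3$.

```python
from itertools import combinations
from math import ceil, exp, pi, sqrt


def modified_hodges_lehmann(sample):
    """Sample median (p = 1/2) of the Walsh averages (x_i + x_j) / 2, i < j."""
    walsh = sorted((x + y) / 2 for x, y in combinations(sample, 2))
    if not walsh:
        raise ValueError("need at least two observations")
    k = ceil(len(walsh) / 2)  # order statistic for inf{x : F_n(x) >= 1/2}
    return walsh[k - 1]


def asymptotic_variance(density, lo=-10.0, hi=10.0, steps=20000):
    """1 / (12 * (int density^2)^2), with the integral by the trapezoidal rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0  # end points count half
        total += weight * density(x) ** 2
    integral = total * width
    return 1.0 / (12.0 * integral ** 2)


def normal_pdf(x):
    """Standard normal density (assumed g for this example)."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


print(modified_hodges_lehmann([1.0, 2.0, 3.0, 4.0]))  # prints 2.5
print(asymptotic_variance(normal_pdf), pi / 3)         # both close to 1.0472
```

The variance refers to the limit distribution of $n^{1/2}$ times the estimator, in line with Corollary 2.2.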

In the same way, the asymptotic distributions for U-quantile-statistics
with kernels

$$h(x_1,\dots,x_m) = m^{-1}\sum_{i=1}^{m} x_i, \quad m \ge 2,$$

can be established by plugging

$$f(0) = m \int\!\cdots\!\int g(x_1 + \dots + x_{m-1})\, g(x_1)\cdots g(x_{m-1})\, dx_1\cdots dx_{m-1}$$

and

$$\zeta + 1/4 = \int \left( \int\!\cdots\!\int G(x_1 + \dots + x_{m-1})\, g(x_2)\cdots g(x_{m-1})\, dx_2\cdots dx_{m-1} \right)^2 g(x_1)\,dx_1$$

into Corollary 2.2.

4.2 Median interpoint distance

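A geometric example of a U-quantile-statistic is given by the sample median
$\theta_n$ of all interpoint distances $\|\xi_i - \xi_j\|$ (with theoretical
median $\theta$) of a sample of i.i.d. points $\xi_1,\dots,\xi_n$ with
continuous density $g$ in $\mathbb{R}^d$, $d \ge 1$. Distances are measured
with respect to any fixed norm $\|\cdot\|$ on $\mathbb{R}^d$. The closed unit
ball induced by this norm is denoted by $B^d$ with surface $S^{d-1}$ and we
write

$$\{x + \theta B^d\} = \{y \in \mathbb{R}^d : \|y - x\| \le \theta\} \quad\text{and}\quad \{x + \theta S^{d-1}\} = \{y \in \mathbb{R}^d : \|y - x\| = \theta\}.$$

The asymptotic normal distribution of $\theta_n$ is established by
Corollary 2.2 and Remark 1. The value of $\zeta$ is found via

$$\zeta + 1/4 = \int \big(P\{\|\xi_1 - x\| \le \theta\}\big)^2 g(x)\,dx = \int \big(P\{\xi_1 \in \{x + \theta B^d\}\}\big)^2 g(x)\,dx = \int \left( \int_{\{x + \theta B^d\}} g(y)\,dy \right)^2 g(x)\,dx,$$

whereas the density $f$ of the random interpoint distance
$\|\xi_1 - \xi_2\|$ at $\theta$ is given by

$$f(\theta)\,d\theta = P\{\theta < \|\xi_1 - \xi_2\| \le \theta + d\theta\} = \int P\{\theta < \|\xi_1 - x\| \le \theta + d\theta\}\, g(x)\,dx = d\theta \int \left( \int_{\{x + \theta S^{d-1}\}} g(y)\,dy \right) g(x)\,dx,$$

hence

$$f(\theta) = \int \left( \int_{\{x + \theta S^{d-1}\}} g(y)\,dy \right) g(x)\,dx.$$

Corollary 2.2 ensures asymptotic normality.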
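For concreteness, the sample median interpoint distance can be computed by brute force. The following Python sketch is illustrative: the function name `median_interpoint_distance` and the optional `norm` argument are assumptions of this example. By default it uses the Euclidean norm via `math.dist` (Python 3.8 or later); any other norm can be passed as a function of the difference vector.

```python
from itertools import combinations
from math import ceil, dist


def median_interpoint_distance(points, norm=None):
    """Sample median (p = 1/2) of the distances ||x_i - x_j||, i < j.

    `norm` maps a difference vector to a non-negative number; if omitted,
    the Euclidean norm is used.
    """
    if norm is None:
        distances = [dist(a, b) for a, b in combinations(points, 2)]
    else:
        distances = [
            norm([ai - bi for ai, bi in zip(a, b)])
            for a, b in combinations(points, 2)
        ]
    if not distances:
        raise ValueError("need at least two points")
    distances.sort()
    k = ceil(len(distances) / 2)  # order statistic for inf{x : F_n(x) >= 1/2}
    return distances[k - 1]


points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
# Euclidean distances are 5, 10 and 5, so the sample median is 5.
print(median_interpoint_distance(points))
# Maximum norm: distances are 4, 8 and 4, so the sample median is 4.
print(median_interpoint_distance(points, norm=lambda v: max(abs(c) for c in v)))
```

Enumerating all pairs costs $O(n^2)$ distance evaluations, which is adequate for a sketch but not for large samples.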